Compiling multi-tiered speech databases into the relational model: experiments with the Emu system

Author

  • Steve Cassidy
Abstract

The Emu speech database system enables the annotation of speech signals at many levels of detail and provides a mechanism for linking these levels to produce a hierarchical annotation. Emu provides facilities for searching collections of these annotations according to both sequential and hierarchical criteria. The results of a search can be used to retrieve acoustic and other data stored along with the annotations. One perceived problem with the Emu system is its limited ability to scale to large databases containing many thousands of utterances. To address this problem we propose a method of translating an Emu database into the relational model, as used by most commercial database systems. Using a Tcl script, the Emu database is converted into a set of tables for the relational database. Queries in the Emu query syntax are translated into SQL, and the query processing times of Emu and the relational database are compared. The results show a marked increase in speed for the relational system on most queries.
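To illustrate the general idea of storing a hierarchical annotation in relational tables, here is a minimal sketch in Python using SQLite. It is not the schema or the Tcl conversion script from the paper: the table names (`segment`, `dominates`), the column names, the sample utterance, and both query functions are hypothetical and chosen only for illustration. One table holds labelled segments at each annotation level, a second table records which segment dominates which, and hierarchical and sequential queries become SQL joins.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with one tiny two-level annotation.

    The schema is an assumption made for this sketch, not the paper's.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE segment (
            id        INTEGER PRIMARY KEY,
            utterance TEXT    NOT NULL,
            level     TEXT    NOT NULL,  -- annotation level, e.g. 'Word'
            seq       INTEGER NOT NULL,  -- position within level and utterance
            label     TEXT    NOT NULL,
            start_ms  REAL,
            end_ms    REAL
        );
        CREATE TABLE dominates (
            parent_id INTEGER NOT NULL REFERENCES segment(id),
            child_id  INTEGER NOT NULL REFERENCES segment(id)
        );
        CREATE INDEX idx_segment_level_label ON segment(level, label);
        CREATE INDEX idx_dominates_parent ON dominates(parent_id);
    """)
    segments = [
        (1, "utt001", "Word", 0, "cat", 0.0, 300.0),
        (2, "utt001", "Phoneme", 0, "k", 0.0, 80.0),
        (3, "utt001", "Phoneme", 1, "ae", 80.0, 210.0),
        (4, "utt001", "Phoneme", 2, "t", 210.0, 300.0),
        (5, "utt001", "Word", 1, "dog", 300.0, 620.0),
        (6, "utt001", "Phoneme", 3, "d", 300.0, 370.0),
        (7, "utt001", "Phoneme", 4, "o", 370.0, 520.0),
        (8, "utt001", "Phoneme", 5, "g", 520.0, 620.0),
    ]
    conn.executemany("INSERT INTO segment VALUES (?, ?, ?, ?, ?, ?, ?)", segments)
    links = [(1, 2), (1, 3), (1, 4), (5, 6), (5, 7), (5, 8)]
    conn.executemany("INSERT INTO dominates VALUES (?, ?)", links)
    return conn


def phonemes_in_word(conn, word_label):
    """Hierarchical query: phoneme labels dominated by a given word."""
    rows = conn.execute("""
        SELECT child.label
        FROM segment AS parent
        JOIN dominates AS d   ON d.parent_id = parent.id
        JOIN segment AS child ON child.id = d.child_id
        WHERE parent.level = 'Word' AND parent.label = ?
          AND child.level = 'Phoneme'
        ORDER BY child.seq
    """, (word_label,)).fetchall()
    return [row[0] for row in rows]


def phoneme_followed_by(conn, first, second):
    """Sequential query: id pairs where `first` is immediately followed by `second`."""
    return conn.execute("""
        SELECT a.id, b.id
        FROM segment AS a
        JOIN segment AS b
          ON b.utterance = a.utterance
         AND b.level = a.level
         AND b.seq = a.seq + 1
        WHERE a.level = 'Phoneme' AND a.label = ? AND b.label = ?
    """, (first, second)).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(phonemes_in_word(db, "cat"))
    print(phoneme_followed_by(db, "ae", "t"))
```

In a layout like this, the relational engine can use its indexes on level, label, and parent links rather than scanning annotation files, which is one plausible reason a relational back end could answer such queries faster on large collections.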

Similar resources

Managing speech databases with emur and the EMU-webapp

As is the nature of the discipline, a majority of speech and language researchers spend a large amount of their time acquiring and transforming data into analyzable and interpretable forms to gain a better understanding of a certain subject matter. In this paper we present a collection of tools that aid the researcher in this sometimes tedious and error-prone process. The tools presented here a...

Querying Databases of Annotated Speech

Annotated speech corpora are databases consisting of signal data along with time-aligned symbolic ‘transcriptions’. Such databases are typically multidimensional, heterogeneous and dynamic. These properties present a number of tough challenges for representation and query. The temporal nature of the data adds an additional layer of complexity. This paper presents and harmonises two independent ...

EMU-SDMS: Advanced speech database management and analysis in R

The amount and complexity of the often very specialized tools necessary for working with spoken language databases has continually evolved and grown over the years. The speech and spoken language research community is expected to be well versed in multiple software tools and have the ability to switch seamlessly between the various tools, sometimes even having to script adhoc solutions to solve...

Modeling Lateral Communication in Holonic Multi Agent Systems

Agents, in a multi agent system, communicate with each other through the process of exchanging messages which is called dialogue. Multi agent organization is generally used to optimize agents’ communications. Holonic organization demonstrates a self-similar recursive and hierarchical structure in which each holon may include some other holons. In a holonic system, lateral communication occurs b...

EMU: an Enhanced Hierarchical Speech Data Management System

EMU is a system for labelling, managing and retrieving data from speech databases such as the Australian ANDOSL database or the US TIMIT. EMU is a re-implementation of the earlier MU+ system (Harrington, Cassidy, Fletcher, and McVeigh 1993) with the aim of providing a more flexible environment. The hierarchical structures and database query facility have been generalised and the system has been...

Publication date: 1999